Search Result

Select

GTC： geo-distributed triangle counting algorithm in graph streams

Chunze CAO, Delong MA, Ye YUAN

Journal of Computer Applications 2023, 43 (7): 2040-2048. DOI: 10.11772/j.issn.1001-9081.2022071130

Abstract （148）

HTML （3）

PDF （1947KB）（159）

Save

The existing distributed triangle counting algorithms assume that all computing nodes are in the same location， while in reality， the nodes may be located in multiple data centers across continents. Geo-distributed data centers which are connected with wide area networks have characteristics of heterogeneous network bandwidth， high communication cost and uneven distribution， and the existing distributed algorithms cannot be applied to geo-distributed environment. At the same time， the existing research which ignores the temporal locality of the formation of triangles mostly adopts strategies such as random sampling and elimination of edges. Therefore， the triangle counting problem of real graph streams in geo-distributed environment was studied， and a Geo-distributed Triangle Counting （GTC） algorithm was proposed. Firstly， aiming at the problem of too high data transmission caused by the existing edge distribution strategy， a geo-distributed edge distribution strategy was proposed to build a benefit formula combining the time benefit and data benefit of communication and use point-to-point communication to replace broadcast edges. Then， for the triangle repeated counting problem caused by point-to-point communication in geo-distributed environment， a final edge calculation rule was proposed to ensure no counting repetition. Finally， based on the time weighted sampling algorithm， a time-weighted triangle counting algorithm was proposed to use the time locality of the triangle to sample. The GTC was compared with Conditional Counting and Sampling （CoCoS） and Tri-Fly on five real graph streams. The results show that GTC has the communication data size decreased by 17% compared to CoCoS and decreased by 44% compared to Tri-Fly， GTC has the error rate decreased by 53% compared to Tri-Fly and slightly less than CoCoS， and GTC has the running time decreased by 34% compared to Tri-Fly and slightly more than CoCoS. It can be seen that the GTC can reduce the size of communication data effectively while ensuring high accuracy and short algorithm running time.

Table and Figures | Reference | Related Articles | Metrics

Select

Unsupervised attributed graph embedding model based on node similarity

Yang LI, Anbiao WU, Ye YUAN, Linlin ZHAO, Guoren WANG

Journal of Computer Applications 2022, 42 (1): 1-8. DOI: 10.11772/j.issn.1001-9081.2021071221

Abstract （810）

HTML （127）

PDF （864KB）（539）

Save

Attributed graph embedding aims to represent the nodes in an attributed graph into low-dimensional vectors while preserving the topology information and attribute information of the nodes. There are lots of works related to attributed graph embedding. However， most of algorithms proposed in them are supervised or semi-supervised. In practical applications， the number of nodes that need to be labeled is large， which makes these algorithms difficult and consume huge manpower and material resources. Above problems were reanalyzed from an unsupervised perspective， and an unsupervised attributed graph embedding algorithm was proposed. Firstly， the topology information and attribute information of the nodes were calculated respectively by using the existing non-attributed graph embedding algorithm and attributes of the attributed graph. Then， the embedding vector of the nodes was obtained by using Graph Convolutional Network （GCN）， and the difference between the embedding vector and the topology information and the difference between the embedding vector and attribute information were minimized. Finally， similar embeddings was obtained by the paired nodes with similar topological information and attribute information. Compared with Graph Auto-Encoder （GAE） method， the proposed method has the node classification accuracy improved by 1.2 percentage points and 2.4 percentage points on Cora and Citeseer datasets respectively. Experimental results show that the proposed method can effectively improve the quality of the generated embedding.

Table and Figures | Reference | Related Articles | Metrics

Select

Rapid stable detection of human faces in image sequence based on MS-KCF model

YE Yuanzheng, LI Xiaoxia, LI Minze

Journal of Computer Applications 2018, 38 (8): 2192-2197. DOI: 10.11772/j.issn.1001-9081.2018020363

Abstract （701）

PDF （1139KB）（594）

Save

In order to quickly and stably detect the faces with large change of angle and serious occlusion in image sequence, a new automatic Detection-Tracking-Detection (DTD) model was proposed by combining the fast and accurate target detection model MobileNet-SSD (MS) and the fast tracking model Kernel Correlation Filtering (KCF), namely MS-KCF face detection model. Firstly, the face was detected quickly and accurately by using MS model, and the tracking model was updated. Secondly, the detected face coordinate information was input into the KCF tracking model to track steadily, and the overall detection speed was accelerated. Finally, to prevent tracking loss, the detection model was updated again after tracking several frames, then the face was detected again. The recall of MS-KCF model in the FDDB face detection benchmark was 93.60%; the recall in Easy, Medium and Hard data sets of WIDER FACE benchmark were 93.11%, 92.18% and 82.97%, respectively; the average speed was 193 frames per second. Experimental results show that the MS-KCF model is stable and fast, which has a good detection effect on the faces with serious shadows and large angle changes.

Reference | Related Articles | Metrics

Select

Pattern match queries oriented to uncertain planar graphs

Guo-ren WANG Ye YUAN Xi-jia ZHANG

Journal of Computer Applications 2011, 31 (04): 874-881.

Abstract （1240）

PDF （1231KB）（414）

Save

Pattern match search oriented to planar graphs is widely used in biological networks, social networks, fingerprint identification and image segmentation. Meanwhile, data extracted from those applications is inherently uncertain due to noise, incompleteness and inaccuracy, and query processing techniques of certain planar graphs cannot be applied to uncertain graph data. Therefore, in this paper, the pattern match query oriented to uncertain planar graphs was studied under the probabilistic semantics. Specifically, Uncertain Pattern Match (UPM) queries using the possible world semantics were investigated. Firstly, to avoid enumerating all possible worlds, a basic deterministic algorithm that can exactly compute UPM query was proposed. To further speed up the basic algorithm, an improved deterministic approach was developed based on tighten bounds. Secondly, a sampling algorithm that can quickly and accurately estimate matching probability was designed. The effectiveness of the proposed solutions for UPM queries was verified through extensive experiments on real uncertain planar graph datasets. Finally, UPM queries were applied to the segmentation on pulmonary CT images. The results show that the proposed approaches are superior to classical techniques of image segmentation.